{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "LmfLXp5_bt-a"
      },
      "source": [
        "# Document search with embeddings"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "kIkJ7zgADMlP"
      },
      "source": [
        "<table align=\"left\">\n",
        "  <td>\n",
        "    <a target=\"_blank\" href=\"https://colab.research.google.com/github/google-gemini/cookbook/blob/main/examples/Talk_to_documents_with_embeddings.ipynb\"><img src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" />Run in Google Colab</a>\n",
        "  </td>\n",
        "</table>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "bbPzgYbrwbK2"
      },
      "source": [
        "## Overview\n",
        "\n",
        "This example demonstrates how to use the Gemini API to create embeddings so that you can perform document search. You will use the Python client library to build a word embedding that allows you to compare search strings, or questions, to document contents.\n",
        "\n",
        "In this tutorial, you'll use embeddings to perform document search over a set of documents to ask questions related to the Google Car.\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "IS2GUSP0_Eim"
      },
      "source": [
        "## Setup"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "YD6urJjWGVDf"
      },
      "outputs": [],
      "source": [
        "!pip install -U -q google.generativeai"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "Zey3UiYGDDzU"
      },
      "outputs": [],
      "source": [
        "import textwrap\n",
        "import numpy as np\n",
        "import pandas as pd\n",
        "\n",
        "import google.generativeai as genai\n",
        "import google.ai.generativelanguage as glm\n",
        "\n",
        "# Used to securely store your API key\n",
        "from google.colab import userdata\n",
        "\n",
        "from IPython.display import Markdown"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "DJriBaWmkL6Z"
      },
      "source": [
        "To run the following cell, your API key must be stored it in a Colab Secret named `GOOGLE_API_KEY`. If you don't already have an API key, or you're not sure how to create a Colab Secret, see the [Authentication](https://github.com/google-gemini/cookbook/blob/main/quickstarts/Authentication.ipynb) quickstart for an example."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "JIm3gEGYhTX1"
      },
      "outputs": [],
      "source": [
        "GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')\n",
        "genai.configure(api_key=GOOGLE_API_KEY)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "gGpQ8Eg0kNXW"
      },
      "source": [
        "## Embedding generation\n",
        "\n",
        "In this section, you will see how to generate embeddings for a piece of text using the embeddings from the Gemini API.\n",
        "\n",
        "See the [Embeddings quickstart](https://github.com/google-gemini/cookbook/blob/main/quickstarts/Embeddings.ipynb) to learn more about the `task_type` parameter used below."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "J76TNa3QDwCc"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "{'embedding': [0.03411343, -0.05517662, -0.020209055, -0.0041249567, 0.058917783, 0.014129515, 0.0045353593, 0.0014303668, 0.05976634, 0.08292115, 0.007162964, 0.0069041685, -0.053083427, -0.010905125, 0.0321402, -0.037163995, 0.050372455, 0.019348344, -0.037328612, 0.026647927, 0.030781753, -0.011288501, -0.031485256, -0.060248993, -0.026219442, -0.009794235, 0.006630139, -0.01846516, -0.026324715, 0.020442624, -0.06317684, 0.014559574, -0.052296035, 0.016451128, -9.720217e-05, -0.051706687, -0.0054406044, -0.056967627, 0.011144145, -0.009201792, -0.0021951047, -0.1099701, -0.011712193, 0.021221714, 0.009171804, -0.029621972, 0.034534883, 0.039578073, 0.019021519, -0.06269169, 0.039473332, 0.052403256, 0.061814185, -0.034507945, -0.009557816, -0.0049551064, 0.017839009, -0.021176832, 0.015043588, 0.015390569, -0.006334281, 0.043696404, -0.028341983, 0.028433999, 0.01472686, -0.06585564, -0.044533554, 0.0055523133, 0.035775978, 0.031099156, 0.027357662, 0.028062241, 0.056972917, -0.054656833, -0.027864764, -0.15486294, -0.027930057, 0.043678433, 0.008391214, 0.020209847, 0.002841071, -0.07201404, -0.05025868, -0.034896467, -0.030400582, 0.016623711, -0.050455835, -0.025557702, 0.0050540236, 0.032266915, -0.018223321, -0.04913693, 0.07667526, -0.03066128, -0.0127946865, 0.107169494, -0.0563475, -0.016773727, -0.010336115, -0.05220995, -0.022049127, 0.00478732, -0.039094422, 0.015671168, 0.041542538, 0.016112784, -0.022650082, 0.002988097, -0.061147556, 0.06630078, -0.057244215, -0.013767544, -0.003466806, -0.053994596, 0.04230463, -0.029314812, 0.021347178, 0.04522084, 0.0072983643, 0.0247336, 0.020755325, 0.025620919, 0.021721177, 0.008178421, -0.063603185, 0.025854606, -0.037521806, 0.020877894, 0.033131972, 0.030288944, 0.0033810628, -0.048698667, -0.027295412, 0.05622969, 0.029634966, 0.029705161, 0.010602056, -0.02217137, 0.03195607, 0.047208548, -0.013620148, 0.038463745, 0.0052672145, -0.024868716, -0.0071725682, 0.069668904, -0.092124775, 0.014151632, 0.0057337824, -0.006012909, -0.035946254, 0.0334855, -0.07426327, 0.033595767, 0.033740804, 0.0394573, -0.048899952, 0.06265119, 0.0028897377, 0.0039847936, 0.0329678, -0.012809373, 0.050108086, 0.009314281, -0.019499086, -0.048860364, -0.015204039, 0.008016007, 0.015667893, 0.03903864, 0.011732032, 0.034669068, -0.01226258, -0.052623957, 0.006695629, -0.05849198, 0.00101155, -0.009621011, -0.0052014636, -0.020959012, -0.02466722, -0.03861565, 0.049754824, 0.048655674, 0.0044479654, -0.020404046, 0.101043485, -0.022594253, -0.06822699, 0.044780556, -0.03859346, -0.015194885, -0.0059435116, -0.016267126, 0.0012126336, 0.054198146, 0.01978253, 0.02905382, 0.034172967, -0.0032252679, 0.020003818, 0.07212547, 0.035888623, -0.00029856138, 0.0044168616, 0.036989234, 0.100975856, 0.0048228516, -0.04405796, 0.00039434276, -0.044601627, -0.011658614, 0.03398768, 0.02250937, 0.034583274, -0.03440395, -0.003274625, -0.005927225, -0.007679341, -0.025777208, -0.02205426, 0.00823437, -0.027172998, -0.015607741, -0.022958823, 0.098416075, -0.045472592, 0.031623535, 0.030663209, -0.03987397, 0.0048750523, 0.057770126, 0.04547866, 0.009881574, 0.044948515, 0.012011639, 0.003141497, 0.0016209317, 0.07142094, 0.025111957, -0.049478546, 0.052195616, 0.041401174, 0.0380032, -0.05878786, -0.007194873, -0.015402912, 0.048243146, 0.025205499, 0.051020827, -0.030305905, -0.031656887, -0.008994425, 0.039839912, -0.043015696, 0.008373317, -0.089018084, -0.045301914, -0.0074205245, 0.0049243467, 0.060365975, -0.06462967, 0.00815101, -0.020417998, 0.00030822973, -0.039288856, 0.04017253, 0.03137731, -0.031728875, 0.03444872, -0.031234143, -0.048502136, 0.033941768, 0.034225147, -0.008299359, 0.033098515, -0.012317135, 0.014448822, 0.06187389, -0.059683096, 0.0012899865, 0.007460227, 0.02652167, -0.07248658, -0.05818953, 0.030052334, -0.015347992, -0.035913672, -0.034901086, -0.0661791, -0.055562418, 0.0130468095, -0.0035763406, -0.0086615095, -0.046888705, -0.005326655, -0.021710252, 0.072175056, 0.038597208, -0.038364347, -0.005039459, -0.07634857, -0.045539834, -0.07372115, 0.018378831, -0.032071035, -0.030269828, -0.044599615, 0.05132602, 0.04769642, 0.0014855171, -0.028435005, -0.0016777773, 0.0072506564, 0.08448479, 0.04224268, -0.029304115, 0.02165559, 0.0056837583, 0.06214723, 0.0028552667, 0.015904495, -0.016737062, -0.0040876116, -0.037475582, 0.04675168, -0.052556172, -0.016293239, -0.014435561, 0.022127734, -0.0052535324, 0.0190588, -0.011537659, 0.0484614, 0.028816467, 0.024607794, -0.043762755, 0.011608192, 0.0021703655, -0.045297306, -0.0048169023, -0.0071430723, 0.011705694, -0.05006429, 0.029933244, -0.020802287, -0.07580785, -0.012268235, 0.07616304, 0.006857885, 0.01853518, 0.043729052, -0.032221675, -0.010121849, -0.019831154, -0.032731093, 0.051531196, 0.0024111927, 0.090960525, -0.036896333, 0.035708647, 0.03678696, 0.00832481, 0.001757778, 0.04144341, 0.042203393, 0.0045936033, 0.021837635, 0.0066275136, 0.0069022025, -0.008452904, -0.03277543, 0.0061044246, -0.02500629, 0.012441071, 0.018081166, -0.06230864, -0.040046707, -0.019351328, 0.0007255696, 0.002202931, -0.041990966, 0.023313772, 0.039377946, -0.0012839311, 0.010378518, 0.0025737497, 0.043841247, -0.0067742136, 0.045794934, 0.01388272, 0.032243907, 0.0919292, 0.03760722, 0.0060486114, 0.010843367, 0.001991803, -0.04838942, -0.006412631, -0.030764624, -0.015602797, -0.048885867, -0.015245706, -0.0006477355, 0.013608845, 0.0040335134, -0.0015530499, -0.008402027, -0.05728556, -0.027370622, 0.019342335, -0.039145477, 0.049000833, -0.052876346, -0.060248777, 0.009484413, 0.011271402, -0.0019944878, -0.013369263, -0.0130786, 0.0050903647, -0.0003995775, 0.04580157, -0.030488051, -0.07777237, 0.022998745, -0.007693635, -0.013473893, 0.0071830116, 0.014312745, 0.019949466, 0.034036275, -0.0011623668, 0.022655929, -0.0049825236, -0.036455333, 0.0033899196, 0.020583669, -0.010457001, 0.027299065, 0.034606297, -0.0111719165, -0.013660416, -0.02705466, -0.05144293, -0.07396907, -0.022817062, -0.0064836126, 0.037774086, -0.06774259, 0.016620712, -0.046481006, -0.030288063, -0.055035893, -0.015402408, -0.014477583, 0.0024700973, 0.024081903, -0.008900536, -0.0032105052, 0.026591286, -0.027869076, -0.014552753, -0.026460772, 0.06831125, -0.019622969, -0.028588912, -0.02271201, 0.0019694276, 0.0079966, -0.013207389, -0.07246265, -0.005246626, -0.03556684, 0.014131167, 0.0018361827, -0.084728725, 0.010380415, -0.038140625, 0.0066234693, -0.023485202, 0.05133969, 0.018931301, -0.0077241925, -0.01968148, -0.0615474, 0.036711216, 0.028462604, -0.02205502, 0.02294784, 0.03529192, 0.044653304, -0.029656367, -0.04243813, -0.024271922, 0.008206945, -0.015324323, 0.028326686, 0.0708875, 0.03499979, -0.04111004, -0.02691298, -0.011054021, 0.035632536, 0.057256706, -0.058149684, 0.022313014, -0.03727344, 0.0095027555, -0.0325091, -0.007395906, 0.009455788, 0.0053972304, -0.028935568, 0.054196633, -0.051867362, -0.010642803, 0.034427024, 0.04308132, 0.020671992, 0.068610825, 0.018303277, -0.08433639, 0.0023544622, -0.009237108, -0.0410166, 0.012912618, -0.035220295, 0.032994937, -0.0063333404, -0.028377546, 0.05429965, -0.022590995, -0.033762764, -0.0061482205, 0.0014308131, 0.05402618, -0.030298075, -0.020893354, 0.04020406, -0.013849863, -0.047842298, 0.032006662, 0.037729368, -0.02878951, 0.002758488, -0.0023380243, -0.052403864, 0.021707276, -0.02718091, 0.0045513017, 0.02493268, -0.016037108, 0.009521465, 0.022595555, -0.03332406, -0.01791281, -0.026219219, 0.015336862, 0.018615942, 0.0014700901, 0.005194217, -0.0059983027, -0.002134208, 0.055935774, 0.0002028429, -0.01381741, 0.0005677742, 0.052481145, -0.0056857914, -0.024219796, -0.0074823913, 0.041230515, 0.005571935, 0.06841099, -0.025634678, -0.037456885, -0.0021465495, -0.05163424, 0.048833348, 0.057269894, -0.0017718605, -0.012836743, 0.054180846, -0.032427873, -0.003244846, 0.01254491, 0.0071952185, 0.02080726, 0.015043071, -0.08000574, 0.047099367, -0.009071923, 0.022494175, -0.007407801, -0.018199192, 0.01923855, -0.016820459, 0.026590073, 0.05919531, -0.015211094, -0.051043298, 0.05085604, -0.027980763, -0.01785205, 0.05260401, 0.0039136643, -0.010834236, 0.015846392, 0.011993318, 0.0085244095, -0.09705911, 0.004848937, -0.03151453, -0.049902532, -0.023312563, 0.0169847, 0.051852323, -0.018586924, -0.011750037, 0.020324359, 0.041236103, 0.046270456, 0.045885824, 0.035645086, 0.027820086, -0.054944187, -0.0018159872, 0.06008568, 0.056207847, 0.03509413, 0.07476336, 0.00042090056, -0.01791933, -0.049269866, 0.013118644, 0.03817175, 0.03985353, -0.023338122, -0.05917611, -0.040447813, -0.014515073, -0.01641867, -0.012444603, -0.015801677, 0.01694387, -0.012097041, -0.10444289, -0.044068433, 0.028175205, 0.0032158983, 0.017225135, 0.024197249, 0.0003871886, 0.008296747, 0.0020322825, -0.06488942, -0.028532177, 0.03631236, -0.021784041, -0.028676897, 0.020023972, -0.015093374, -0.0053404626, -0.035407133, -0.03022746, -0.045240995, -0.089037456, -0.05241791, 0.01601896, -0.058039088, 0.06633133, 0.01435994, 0.0024608225, 0.02044063, 0.049869247, 0.013966787, 0.011062478, 0.023516618, 0.010368709, 0.039040443, -0.03096598, -0.01665127, 0.010691767, -0.0089797005, 0.018564576, 0.03291386, 0.0032383145, -0.00884169, -0.008645399, 0.0001677955, -0.04452774, 0.007207213, -0.008696507, 0.0023566217, -0.025329702, -0.042708885, -0.03173582, 0.06427912, 0.030916397, -0.022305708, -0.018711232, -0.008136281, -0.01636213, 0.019092057, 0.010243902, -0.04405114, 0.018331835, -0.025844995, 0.035896596, 0.049257137, -0.053962618, -0.084952496, -0.009314442, -0.03644633, 0.0010881334, -0.042904764, 0.016017154, -0.011390375, 0.056498464, 0.007735383, 0.015750613, 0.023586866, -0.005065194, -0.05339934, 0.030084236, -0.021841932, -0.0035868485, -0.025362536, 0.0315042, 0.039552346, -0.032164883, -0.03519624, -0.013936666, 0.006526046, 0.02818671, -0.018081086, 0.04806136, -0.04418975, -0.064630605, -0.010125073, -0.02926605, 0.022641547, 0.040159058, 0.022463534, -0.04924557, -0.010198766, -0.019940902, -0.0033762371, -0.07010838, -0.031799905, -0.020567331, -0.015259151, 0.04870838, 0.030047685, -0.016861487, 0.020778332, -0.034649372, -0.0026895248, -0.0053685517, -0.03297844, -0.0048753927, -0.005587019, -0.041837722, 0.0161564, 0.072810896, -0.043315165, 0.03330332]}\n"
          ]
        }
      ],
      "source": [
        "title = \"The next generation of AI for developers and Google Workspace\"\n",
        "sample_text = (\"Title: The next generation of AI for developers and Google Workspace\"\n",
        "    \"\\n\"\n",
        "    \"Full article:\\n\"\n",
        "    \"\\n\"\n",
        "    \"Gemini API & Google AI Studio: An approachable way to explore and prototype with generative AI applications\")\n",
        "\n",
        "model = 'models/embedding-001'\n",
        "embedding = genai.embed_content(model=model,\n",
        "                                content=sample_text,\n",
        "                                task_type=\"retrieval_document\",\n",
        "                                title=title)\n",
        "\n",
        "print(embedding)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "dD1lQx3Zr3S2"
      },
      "source": [
        "## Building an embeddings database\n",
        "\n",
        "Here are three sample texts to use to build the embeddings database. You will use the Gemini API to create embeddings of each of the documents. Turn them into a dataframe for better visualization."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "XvLRIbpq4vNN"
      },
      "outputs": [],
      "source": [
        "DOCUMENT1 = {\n",
        "    \"title\": \"Operating the Climate Control System\",\n",
        "    \"content\": \"Your Googlecar has a climate control system that allows you to adjust the temperature and airflow in the car. To operate the climate control system, use the buttons and knobs located on the center console.  Temperature: The temperature knob controls the temperature inside the car. Turn the knob clockwise to increase the temperature or counterclockwise to decrease the temperature. Airflow: The airflow knob controls the amount of airflow inside the car. Turn the knob clockwise to increase the airflow or counterclockwise to decrease the airflow. Fan speed: The fan speed knob controls the speed of the fan. Turn the knob clockwise to increase the fan speed or counterclockwise to decrease the fan speed. Mode: The mode button allows you to select the desired mode. The available modes are: Auto: The car will automatically adjust the temperature and airflow to maintain a comfortable level. Cool: The car will blow cool air into the car. Heat: The car will blow warm air into the car. Defrost: The car will blow warm air onto the windshield to defrost it.\"}\n",
        "DOCUMENT2 = {\n",
        "    \"title\": \"Touchscreen\",\n",
        "    \"content\": \"Your Googlecar has a large touchscreen display that provides access to a variety of features, including navigation, entertainment, and climate control. To use the touchscreen display, simply touch the desired icon.  For example, you can touch the \\\"Navigation\\\" icon to get directions to your destination or touch the \\\"Music\\\" icon to play your favorite songs.\"}\n",
        "DOCUMENT3 = {\n",
        "    \"title\": \"Shifting Gears\",\n",
        "    \"content\": \"Your Googlecar has an automatic transmission. To shift gears, simply move the shift lever to the desired position.  Park: This position is used when you are parked. The wheels are locked and the car cannot move. Reverse: This position is used to back up. Neutral: This position is used when you are stopped at a light or in traffic. The car is not in gear and will not move unless you press the gas pedal. Drive: This position is used to drive forward. Low: This position is used for driving in snow or other slippery conditions.\"}\n",
        "\n",
        "documents = [DOCUMENT1, DOCUMENT2, DOCUMENT3]"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "WwhCQwPbvwc-"
      },
      "source": [
        "Organize the contents of the dictionary into a dataframe for better visualization."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "GJKLIW9Z31Vf"
      },
      "outputs": [
        {
          "data": {
            "application/vnd.google.colaboratory.intrinsic+json": {
              "summary": "{\n  \"name\": \"df\",\n  \"rows\": 3,\n  \"fields\": [\n    {\n      \"column\": \"Title\",\n      \"properties\": {\n        \"dtype\": \"string\",\n        \"num_unique_values\": 3,\n        \"samples\": [\n          \"Operating the Climate Control System\",\n          \"Touchscreen\",\n          \"Shifting Gears\"\n        ],\n        \"semantic_type\": \"\",\n        \"description\": \"\"\n      }\n    },\n    {\n      \"column\": \"Text\",\n      \"properties\": {\n        \"dtype\": \"string\",\n        \"num_unique_values\": 3,\n        \"samples\": [\n          \"Your Googlecar has a climate control system that allows you to adjust the temperature and airflow in the car. To operate the climate control system, use the buttons and knobs located on the center console.  Temperature: The temperature knob controls the temperature inside the car. Turn the knob clockwise to increase the temperature or counterclockwise to decrease the temperature. Airflow: The airflow knob controls the amount of airflow inside the car. Turn the knob clockwise to increase the airflow or counterclockwise to decrease the airflow. Fan speed: The fan speed knob controls the speed of the fan. Turn the knob clockwise to increase the fan speed or counterclockwise to decrease the fan speed. Mode: The mode button allows you to select the desired mode. The available modes are: Auto: The car will automatically adjust the temperature and airflow to maintain a comfortable level. Cool: The car will blow cool air into the car. Heat: The car will blow warm air into the car. Defrost: The car will blow warm air onto the windshield to defrost it.\",\n          \"Your Googlecar has a large touchscreen display that provides access to a variety of features, including navigation, entertainment, and climate control. To use the touchscreen display, simply touch the desired icon.  For example, you can touch the \\\"Navigation\\\" icon to get directions to your destination or touch the \\\"Music\\\" icon to play your favorite songs.\",\n          \"Your Googlecar has an automatic transmission. To shift gears, simply move the shift lever to the desired position.  Park: This position is used when you are parked. The wheels are locked and the car cannot move. Reverse: This position is used to back up. Neutral: This position is used when you are stopped at a light or in traffic. The car is not in gear and will not move unless you press the gas pedal. Drive: This position is used to drive forward. Low: This position is used for driving in snow or other slippery conditions.\"\n        ],\n        \"semantic_type\": \"\",\n        \"description\": \"\"\n      }\n    }\n  ]\n}",
              "type": "dataframe",
              "variable_name": "df"
            },
            "text/html": [
              "\n",
              "  <div id=\"df-b7b8a293-4568-479b-8ec2-95e443d5bca3\" class=\"colab-df-container\">\n",
              "    <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>Title</th>\n",
              "      <th>Text</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>Operating the Climate Control System</td>\n",
              "      <td>Your Googlecar has a climate control system th...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>Touchscreen</td>\n",
              "      <td>Your Googlecar has a large touchscreen display...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>Shifting Gears</td>\n",
              "      <td>Your Googlecar has an automatic transmission. ...</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "    <div class=\"colab-df-buttons\">\n",
              "\n",
              "  <div class=\"colab-df-container\">\n",
              "    <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-b7b8a293-4568-479b-8ec2-95e443d5bca3')\"\n",
              "            title=\"Convert this dataframe to an interactive table.\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\" viewBox=\"0 -960 960 960\">\n",
              "    <path d=\"M120-120v-720h720v720H120Zm60-500h600v-160H180v160Zm220 220h160v-160H400v160Zm0 220h160v-160H400v160ZM180-400h160v-160H180v160Zm440 0h160v-160H620v160ZM180-180h160v-160H180v160Zm440 0h160v-160H620v160Z\"/>\n",
              "  </svg>\n",
              "    </button>\n",
              "\n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    .colab-df-buttons div {\n",
              "      margin-bottom: 4px;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "    <script>\n",
              "      const buttonEl =\n",
              "        document.querySelector('#df-b7b8a293-4568-479b-8ec2-95e443d5bca3 button.colab-df-convert');\n",
              "      buttonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "      async function convertToInteractive(key) {\n",
              "        const element = document.querySelector('#df-b7b8a293-4568-479b-8ec2-95e443d5bca3');\n",
              "        const dataTable =\n",
              "          await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                    [key], {});\n",
              "        if (!dataTable) return;\n",
              "\n",
              "        const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "          '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "          + ' to learn more about interactive tables.';\n",
              "        element.innerHTML = '';\n",
              "        dataTable['output_type'] = 'display_data';\n",
              "        await google.colab.output.renderOutput(dataTable, element);\n",
              "        const docLink = document.createElement('div');\n",
              "        docLink.innerHTML = docLinkHtml;\n",
              "        element.appendChild(docLink);\n",
              "      }\n",
              "    </script>\n",
              "  </div>\n",
              "\n",
              "\n",
              "<div id=\"df-bfe56492-1ebc-4b9e-b198-609718550457\">\n",
              "  <button class=\"colab-df-quickchart\" onclick=\"quickchart('df-bfe56492-1ebc-4b9e-b198-609718550457')\"\n",
              "            title=\"Suggest charts\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "<svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "     width=\"24px\">\n",
              "    <g>\n",
              "        <path d=\"M19 3H5c-1.1 0-2 .9-2 2v14c0 1.1.9 2 2 2h14c1.1 0 2-.9 2-2V5c0-1.1-.9-2-2-2zM9 17H7v-7h2v7zm4 0h-2V7h2v10zm4 0h-2v-4h2v4z\"/>\n",
              "    </g>\n",
              "</svg>\n",
              "  </button>\n",
              "\n",
              "<style>\n",
              "  .colab-df-quickchart {\n",
              "      --bg-color: #E8F0FE;\n",
              "      --fill-color: #1967D2;\n",
              "      --hover-bg-color: #E2EBFA;\n",
              "      --hover-fill-color: #174EA6;\n",
              "      --disabled-fill-color: #AAA;\n",
              "      --disabled-bg-color: #DDD;\n",
              "  }\n",
              "\n",
              "  [theme=dark] .colab-df-quickchart {\n",
              "      --bg-color: #3B4455;\n",
              "      --fill-color: #D2E3FC;\n",
              "      --hover-bg-color: #434B5C;\n",
              "      --hover-fill-color: #FFFFFF;\n",
              "      --disabled-bg-color: #3B4455;\n",
              "      --disabled-fill-color: #666;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart {\n",
              "    background-color: var(--bg-color);\n",
              "    border: none;\n",
              "    border-radius: 50%;\n",
              "    cursor: pointer;\n",
              "    display: none;\n",
              "    fill: var(--fill-color);\n",
              "    height: 32px;\n",
              "    padding: 0;\n",
              "    width: 32px;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart:hover {\n",
              "    background-color: var(--hover-bg-color);\n",
              "    box-shadow: 0 1px 2px rgba(60, 64, 67, 0.3), 0 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "    fill: var(--button-hover-fill-color);\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart-complete:disabled,\n",
              "  .colab-df-quickchart-complete:disabled:hover {\n",
              "    background-color: var(--disabled-bg-color);\n",
              "    fill: var(--disabled-fill-color);\n",
              "    box-shadow: none;\n",
              "  }\n",
              "\n",
              "  .colab-df-spinner {\n",
              "    border: 2px solid var(--fill-color);\n",
              "    border-color: transparent;\n",
              "    border-bottom-color: var(--fill-color);\n",
              "    animation:\n",
              "      spin 1s steps(1) infinite;\n",
              "  }\n",
              "\n",
              "  @keyframes spin {\n",
              "    0% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "      border-left-color: var(--fill-color);\n",
              "    }\n",
              "    20% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    30% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    40% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    60% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    80% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "    90% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "  }\n",
              "</style>\n",
              "\n",
              "  <script>\n",
              "    async function quickchart(key) {\n",
              "      const quickchartButtonEl =\n",
              "        document.querySelector('#' + key + ' button');\n",
              "      quickchartButtonEl.disabled = true;  // To prevent multiple clicks.\n",
              "      quickchartButtonEl.classList.add('colab-df-spinner');\n",
              "      try {\n",
              "        const charts = await google.colab.kernel.invokeFunction(\n",
              "            'suggestCharts', [key], {});\n",
              "      } catch (error) {\n",
              "        console.error('Error during call to suggestCharts:', error);\n",
              "      }\n",
              "      quickchartButtonEl.classList.remove('colab-df-spinner');\n",
              "      quickchartButtonEl.classList.add('colab-df-quickchart-complete');\n",
              "    }\n",
              "    (() => {\n",
              "      let quickchartButtonEl =\n",
              "        document.querySelector('#df-bfe56492-1ebc-4b9e-b198-609718550457 button');\n",
              "      quickchartButtonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "    })();\n",
              "  </script>\n",
              "</div>\n",
              "\n",
              "  <div id=\"id_a63dc46d-57bc-4978-b105-926af5fa9886\">\n",
              "    <style>\n",
              "      .colab-df-generate {\n",
              "        background-color: #E8F0FE;\n",
              "        border: none;\n",
              "        border-radius: 50%;\n",
              "        cursor: pointer;\n",
              "        display: none;\n",
              "        fill: #1967D2;\n",
              "        height: 32px;\n",
              "        padding: 0 0 0 0;\n",
              "        width: 32px;\n",
              "      }\n",
              "\n",
              "      .colab-df-generate:hover {\n",
              "        background-color: #E2EBFA;\n",
              "        box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "        fill: #174EA6;\n",
              "      }\n",
              "\n",
              "      [theme=dark] .colab-df-generate {\n",
              "        background-color: #3B4455;\n",
              "        fill: #D2E3FC;\n",
              "      }\n",
              "\n",
              "      [theme=dark] .colab-df-generate:hover {\n",
              "        background-color: #434B5C;\n",
              "        box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "        filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "        fill: #FFFFFF;\n",
              "      }\n",
              "    </style>\n",
              "    <button class=\"colab-df-generate\" onclick=\"generateWithVariable('df')\"\n",
              "            title=\"Generate code using this dataframe.\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M7,19H8.4L18.45,9,17,7.55,7,17.6ZM5,21V16.75L18.45,3.32a2,2,0,0,1,2.83,0l1.4,1.43a1.91,1.91,0,0,1,.58,1.4,1.91,1.91,0,0,1-.58,1.4L9.25,21ZM18.45,9,17,7.55Zm-12,3A5.31,5.31,0,0,0,4.9,8.1,5.31,5.31,0,0,0,1,6.5,5.31,5.31,0,0,0,4.9,4.9,5.31,5.31,0,0,0,6.5,1,5.31,5.31,0,0,0,8.1,4.9,5.31,5.31,0,0,0,12,6.5,5.46,5.46,0,0,0,6.5,12Z\"/>\n",
              "  </svg>\n",
              "    </button>\n",
              "    <script>\n",
              "      (() => {\n",
              "      const buttonEl =\n",
              "        document.querySelector('#id_a63dc46d-57bc-4978-b105-926af5fa9886 button.colab-df-generate');\n",
              "      buttonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "      buttonEl.onclick = () => {\n",
              "        google.colab.notebook.generateWithVariable('df');\n",
              "      }\n",
              "      })();\n",
              "    </script>\n",
              "  </div>\n",
              "\n",
              "    </div>\n",
              "  </div>\n"
            ],
            "text/plain": [
              "                                  Title  \\\n",
              "0  Operating the Climate Control System   \n",
              "1                           Touchscreen   \n",
              "2                        Shifting Gears   \n",
              "\n",
              "                                                Text  \n",
              "0  Your Googlecar has a climate control system th...  \n",
              "1  Your Googlecar has a large touchscreen display...  \n",
              "2  Your Googlecar has an automatic transmission. ...  "
            ]
          },
          "execution_count": 8,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "df = pd.DataFrame(documents)\n",
        "df.columns = ['Title', 'Text']\n",
        "df"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "LHonPYEwStLB"
      },
      "source": [
        "Get the embeddings for each of these bodies of text. Add this information to the dataframe."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "4SOhy0lNBhfN"
      },
      "outputs": [
        {
          "data": {
            "application/vnd.google.colaboratory.intrinsic+json": {
              "summary": "{\n  \"name\": \"df\",\n  \"rows\": 3,\n  \"fields\": [\n    {\n      \"column\": \"Title\",\n      \"properties\": {\n        \"dtype\": \"string\",\n        \"num_unique_values\": 3,\n        \"samples\": [\n          \"Operating the Climate Control System\",\n          \"Touchscreen\",\n          \"Shifting Gears\"\n        ],\n        \"semantic_type\": \"\",\n        \"description\": \"\"\n      }\n    },\n    {\n      \"column\": \"Text\",\n      \"properties\": {\n        \"dtype\": \"string\",\n        \"num_unique_values\": 3,\n        \"samples\": [\n          \"Your Googlecar has a climate control system that allows you to adjust the temperature and airflow in the car. To operate the climate control system, use the buttons and knobs located on the center console.  Temperature: The temperature knob controls the temperature inside the car. Turn the knob clockwise to increase the temperature or counterclockwise to decrease the temperature. Airflow: The airflow knob controls the amount of airflow inside the car. Turn the knob clockwise to increase the airflow or counterclockwise to decrease the airflow. Fan speed: The fan speed knob controls the speed of the fan. Turn the knob clockwise to increase the fan speed or counterclockwise to decrease the fan speed. Mode: The mode button allows you to select the desired mode. The available modes are: Auto: The car will automatically adjust the temperature and airflow to maintain a comfortable level. Cool: The car will blow cool air into the car. Heat: The car will blow warm air into the car. Defrost: The car will blow warm air onto the windshield to defrost it.\",\n          \"Your Googlecar has a large touchscreen display that provides access to a variety of features, including navigation, entertainment, and climate control. To use the touchscreen display, simply touch the desired icon.  For example, you can touch the \\\"Navigation\\\" icon to get directions to your destination or touch the \\\"Music\\\" icon to play your favorite songs.\",\n          \"Your Googlecar has an automatic transmission. To shift gears, simply move the shift lever to the desired position.  Park: This position is used when you are parked. The wheels are locked and the car cannot move. Reverse: This position is used to back up. Neutral: This position is used when you are stopped at a light or in traffic. The car is not in gear and will not move unless you press the gas pedal. Drive: This position is used to drive forward. Low: This position is used for driving in snow or other slippery conditions.\"\n        ],\n        \"semantic_type\": \"\",\n        \"description\": \"\"\n      }\n    },\n    {\n      \"column\": \"Embeddings\",\n      \"properties\": {\n        \"dtype\": \"object\",\n        \"semantic_type\": \"\",\n        \"description\": \"\"\n      }\n    }\n  ]\n}",
              "type": "dataframe",
              "variable_name": "df"
            },
            "text/html": [
              "\n",
              "  <div id=\"df-47129e4e-4d97-4ad2-b6c9-fcf7753446f8\" class=\"colab-df-container\">\n",
              "    <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>Title</th>\n",
              "      <th>Text</th>\n",
              "      <th>Embeddings</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>Operating the Climate Control System</td>\n",
              "      <td>Your Googlecar has a climate control system th...</td>\n",
              "      <td>[-0.033361107, -0.021217084, -0.049581926, -0....</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>Touchscreen</td>\n",
              "      <td>Your Googlecar has a large touchscreen display...</td>\n",
              "      <td>[0.009660736, -0.030662702, -0.017281422, -0.0...</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>Shifting Gears</td>\n",
              "      <td>Your Googlecar has an automatic transmission. ...</td>\n",
              "      <td>[-0.04270796, -0.007160868, -0.03242516, -0.02...</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "    <div class=\"colab-df-buttons\">\n",
              "\n",
              "  <div class=\"colab-df-container\">\n",
              "    <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-47129e4e-4d97-4ad2-b6c9-fcf7753446f8')\"\n",
              "            title=\"Convert this dataframe to an interactive table.\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\" viewBox=\"0 -960 960 960\">\n",
              "    <path d=\"M120-120v-720h720v720H120Zm60-500h600v-160H180v160Zm220 220h160v-160H400v160Zm0 220h160v-160H400v160ZM180-400h160v-160H180v160Zm440 0h160v-160H620v160ZM180-180h160v-160H180v160Zm440 0h160v-160H620v160Z\"/>\n",
              "  </svg>\n",
              "    </button>\n",
              "\n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    .colab-df-buttons div {\n",
              "      margin-bottom: 4px;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "    <script>\n",
              "      const buttonEl =\n",
              "        document.querySelector('#df-47129e4e-4d97-4ad2-b6c9-fcf7753446f8 button.colab-df-convert');\n",
              "      buttonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "      async function convertToInteractive(key) {\n",
              "        const element = document.querySelector('#df-47129e4e-4d97-4ad2-b6c9-fcf7753446f8');\n",
              "        const dataTable =\n",
              "          await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                    [key], {});\n",
              "        if (!dataTable) return;\n",
              "\n",
              "        const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "          '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "          + ' to learn more about interactive tables.';\n",
              "        element.innerHTML = '';\n",
              "        dataTable['output_type'] = 'display_data';\n",
              "        await google.colab.output.renderOutput(dataTable, element);\n",
              "        const docLink = document.createElement('div');\n",
              "        docLink.innerHTML = docLinkHtml;\n",
              "        element.appendChild(docLink);\n",
              "      }\n",
              "    </script>\n",
              "  </div>\n",
              "\n",
              "\n",
              "<div id=\"df-85deb16d-1dac-4f54-ba86-7882685fb3ae\">\n",
              "  <button class=\"colab-df-quickchart\" onclick=\"quickchart('df-85deb16d-1dac-4f54-ba86-7882685fb3ae')\"\n",
              "            title=\"Suggest charts\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "<svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "     width=\"24px\">\n",
              "    <g>\n",
              "        <path d=\"M19 3H5c-1.1 0-2 .9-2 2v14c0 1.1.9 2 2 2h14c1.1 0 2-.9 2-2V5c0-1.1-.9-2-2-2zM9 17H7v-7h2v7zm4 0h-2V7h2v10zm4 0h-2v-4h2v4z\"/>\n",
              "    </g>\n",
              "</svg>\n",
              "  </button>\n",
              "\n",
              "<style>\n",
              "  .colab-df-quickchart {\n",
              "      --bg-color: #E8F0FE;\n",
              "      --fill-color: #1967D2;\n",
              "      --hover-bg-color: #E2EBFA;\n",
              "      --hover-fill-color: #174EA6;\n",
              "      --disabled-fill-color: #AAA;\n",
              "      --disabled-bg-color: #DDD;\n",
              "  }\n",
              "\n",
              "  [theme=dark] .colab-df-quickchart {\n",
              "      --bg-color: #3B4455;\n",
              "      --fill-color: #D2E3FC;\n",
              "      --hover-bg-color: #434B5C;\n",
              "      --hover-fill-color: #FFFFFF;\n",
              "      --disabled-bg-color: #3B4455;\n",
              "      --disabled-fill-color: #666;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart {\n",
              "    background-color: var(--bg-color);\n",
              "    border: none;\n",
              "    border-radius: 50%;\n",
              "    cursor: pointer;\n",
              "    display: none;\n",
              "    fill: var(--fill-color);\n",
              "    height: 32px;\n",
              "    padding: 0;\n",
              "    width: 32px;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart:hover {\n",
              "    background-color: var(--hover-bg-color);\n",
              "    box-shadow: 0 1px 2px rgba(60, 64, 67, 0.3), 0 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "    fill: var(--button-hover-fill-color);\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart-complete:disabled,\n",
              "  .colab-df-quickchart-complete:disabled:hover {\n",
              "    background-color: var(--disabled-bg-color);\n",
              "    fill: var(--disabled-fill-color);\n",
              "    box-shadow: none;\n",
              "  }\n",
              "\n",
              "  .colab-df-spinner {\n",
              "    border: 2px solid var(--fill-color);\n",
              "    border-color: transparent;\n",
              "    border-bottom-color: var(--fill-color);\n",
              "    animation:\n",
              "      spin 1s steps(1) infinite;\n",
              "  }\n",
              "\n",
              "  @keyframes spin {\n",
              "    0% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "      border-left-color: var(--fill-color);\n",
              "    }\n",
              "    20% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    30% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    40% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    60% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    80% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "    90% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "  }\n",
              "</style>\n",
              "\n",
              "  <script>\n",
              "    async function quickchart(key) {\n",
              "      const quickchartButtonEl =\n",
              "        document.querySelector('#' + key + ' button');\n",
              "      quickchartButtonEl.disabled = true;  // To prevent multiple clicks.\n",
              "      quickchartButtonEl.classList.add('colab-df-spinner');\n",
              "      try {\n",
              "        const charts = await google.colab.kernel.invokeFunction(\n",
              "            'suggestCharts', [key], {});\n",
              "      } catch (error) {\n",
              "        console.error('Error during call to suggestCharts:', error);\n",
              "      }\n",
              "      quickchartButtonEl.classList.remove('colab-df-spinner');\n",
              "      quickchartButtonEl.classList.add('colab-df-quickchart-complete');\n",
              "    }\n",
              "    (() => {\n",
              "      let quickchartButtonEl =\n",
              "        document.querySelector('#df-85deb16d-1dac-4f54-ba86-7882685fb3ae button');\n",
              "      quickchartButtonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "    })();\n",
              "  </script>\n",
              "</div>\n",
              "\n",
              "  <div id=\"id_17bba104-e594-45b8-a083-0b6cacd2a8ac\">\n",
              "    <style>\n",
              "      .colab-df-generate {\n",
              "        background-color: #E8F0FE;\n",
              "        border: none;\n",
              "        border-radius: 50%;\n",
              "        cursor: pointer;\n",
              "        display: none;\n",
              "        fill: #1967D2;\n",
              "        height: 32px;\n",
              "        padding: 0 0 0 0;\n",
              "        width: 32px;\n",
              "      }\n",
              "\n",
              "      .colab-df-generate:hover {\n",
              "        background-color: #E2EBFA;\n",
              "        box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "        fill: #174EA6;\n",
              "      }\n",
              "\n",
              "      [theme=dark] .colab-df-generate {\n",
              "        background-color: #3B4455;\n",
              "        fill: #D2E3FC;\n",
              "      }\n",
              "\n",
              "      [theme=dark] .colab-df-generate:hover {\n",
              "        background-color: #434B5C;\n",
              "        box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "        filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "        fill: #FFFFFF;\n",
              "      }\n",
              "    </style>\n",
              "    <button class=\"colab-df-generate\" onclick=\"generateWithVariable('df')\"\n",
              "            title=\"Generate code using this dataframe.\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "       width=\"24px\">\n",
              "    <path d=\"M7,19H8.4L18.45,9,17,7.55,7,17.6ZM5,21V16.75L18.45,3.32a2,2,0,0,1,2.83,0l1.4,1.43a1.91,1.91,0,0,1,.58,1.4,1.91,1.91,0,0,1-.58,1.4L9.25,21ZM18.45,9,17,7.55Zm-12,3A5.31,5.31,0,0,0,4.9,8.1,5.31,5.31,0,0,0,1,6.5,5.31,5.31,0,0,0,4.9,4.9,5.31,5.31,0,0,0,6.5,1,5.31,5.31,0,0,0,8.1,4.9,5.31,5.31,0,0,0,12,6.5,5.46,5.46,0,0,0,6.5,12Z\"/>\n",
              "  </svg>\n",
              "    </button>\n",
              "    <script>\n",
              "      (() => {\n",
              "      const buttonEl =\n",
              "        document.querySelector('#id_17bba104-e594-45b8-a083-0b6cacd2a8ac button.colab-df-generate');\n",
              "      buttonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "      buttonEl.onclick = () => {\n",
              "        google.colab.notebook.generateWithVariable('df');\n",
              "      }\n",
              "      })();\n",
              "    </script>\n",
              "  </div>\n",
              "\n",
              "    </div>\n",
              "  </div>\n"
            ],
            "text/plain": [
              "                                  Title  \\\n",
              "0  Operating the Climate Control System   \n",
              "1                           Touchscreen   \n",
              "2                        Shifting Gears   \n",
              "\n",
              "                                                Text  \\\n",
              "0  Your Googlecar has a climate control system th...   \n",
              "1  Your Googlecar has a large touchscreen display...   \n",
              "2  Your Googlecar has an automatic transmission. ...   \n",
              "\n",
              "                                          Embeddings  \n",
              "0  [-0.033361107, -0.021217084, -0.049581926, -0....  \n",
              "1  [0.009660736, -0.030662702, -0.017281422, -0.0...  \n",
              "2  [-0.04270796, -0.007160868, -0.03242516, -0.02...  "
            ]
          },
          "execution_count": 9,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "# Get the embeddings of each text and add to an embeddings column in the dataframe\n",
        "def embed_fn(title, text):\n",
        "  return genai.embed_content(model=model,\n",
        "                             content=text,\n",
        "                             task_type=\"retrieval_document\",\n",
        "                             title=title)[\"embedding\"]\n",
        "\n",
        "df['Embeddings'] = df.apply(lambda row: embed_fn(row['Title'], row['Text']), axis=1)\n",
        "df"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "cfm8a31FKd00"
      },
      "source": [
        "## Document search with Q&A\n",
        "\n",
        "Now that the embeddings are generated, let's create a Q&A system to search these documents. You will ask a question about hyperparameter tuning, create an embedding of the question, and compare it against the collection of embeddings in the dataframe.\n",
        "\n",
        "The embedding of the question will be a vector (list of float values), which will be compared against the vector of the documents using the dot product. This vector returned from the API is already normalized. The dot product represents the similarity in direction between two vectors.\n",
        "\n",
        "The values of the dot product can range between -1 and 1, inclusive. If the dot product between two vectors is 1, then the vectors are in the same direction. If the dot product value is 0, then these vectors are orthogonal, or unrelated, to each other. Lastly, if the dot product is -1, then the vectors point in the opposite direction and are not similar to each other.\n",
        "\n",
        "Note, with the new embeddings model (`embedding-001`), specify the task type as `QUERY` for user query and `DOCUMENT` when embedding a document text.\n",
        "\n",
        "Task Type | Description\n",
        "---       | ---\n",
        "RETRIEVAL_QUERY\t| Specifies the given text is a query in a search/retrieval setting.\n",
        "RETRIEVAL_DOCUMENT | Specifies the given text is a document in a search/retrieval setting."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "80w2VQQ9JWcU"
      },
      "outputs": [],
      "source": [
        "query = \"How do you shift gears in the Google car?\"\n",
        "model = 'models/embedding-001'\n",
        "\n",
        "request = genai.embed_content(model=model,\n",
        "                              content=query,\n",
        "                              task_type=\"retrieval_query\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "iivgDQej5Agt"
      },
      "source": [
        "Use the `find_best_passage` function to calculate the dot products, and then sort the dataframe from the largest to smallest dot product value to retrieve the relevant passage out of the database."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "am36P3J9M6Zv"
      },
      "outputs": [],
      "source": [
        "def find_best_passage(query, dataframe):\n",
        "  \"\"\"\n",
        "  Compute the distances between the query and each document in the dataframe\n",
        "  using the dot product.\n",
        "  \"\"\"\n",
        "  query_embedding = genai.embed_content(model=model,\n",
        "                                        content=query,\n",
        "                                        task_type=\"retrieval_query\")\n",
        "  dot_products = np.dot(np.stack(dataframe['Embeddings']), query_embedding[\"embedding\"])\n",
        "  idx = np.argmax(dot_products)\n",
        "  return dataframe.iloc[idx]['Text'] # Return text from index with max value"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Uq-bpLZm9DKo"
      },
      "source": [
        "View the most relevant document from the database:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "1I5lAqdH9zWL"
      },
      "outputs": [
        {
          "data": {
            "application/vnd.google.colaboratory.intrinsic+json": {
              "type": "string"
            },
            "text/plain": [
              "'Your Googlecar has an automatic transmission. To shift gears, simply move the shift lever to the desired position.  Park: This position is used when you are parked. The wheels are locked and the car cannot move. Reverse: This position is used to back up. Neutral: This position is used when you are stopped at a light or in traffic. The car is not in gear and will not move unless you press the gas pedal. Drive: This position is used to drive forward. Low: This position is used for driving in snow or other slippery conditions.'"
            ]
          },
          "execution_count": 12,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "passage = find_best_passage(query, df)\n",
        "passage"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ebkGT0ha5Ln3"
      },
      "source": [
        "## Question and Answering Application\n",
        "\n",
        "Let's try to use the text generation API to create a Q & A system. Input your own custom data below to create a simple question and answering example. You will still use the dot product as a metric of similarity."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "pqf-OsT3auTm"
      },
      "outputs": [],
      "source": [
        "def make_prompt(query, relevant_passage):\n",
        "  escaped = relevant_passage.replace(\"'\", \"\").replace('\"', \"\").replace(\"\\n\", \" \")\n",
        "  prompt = textwrap.dedent(\"\"\"You are a helpful and informative bot that answers questions using text from the reference passage included below. \\\n",
        "  Be sure to respond in a complete sentence, being comprehensive, including all relevant background information. \\\n",
        "  However, you are talking to a non-technical audience, so be sure to break down complicated concepts and \\\n",
        "  strike a friendly and converstional tone. \\\n",
        "  If the passage is irrelevant to the answer, you may ignore it.\n",
        "  QUESTION: '{query}'\n",
        "  PASSAGE: '{relevant_passage}'\n",
        "\n",
        "    ANSWER:\n",
        "  \"\"\").format(query=query, relevant_passage=escaped)\n",
        "\n",
        "  return prompt"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "mlpDRG3cVvQE"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "You are a helpful and informative bot that answers questions using text from the reference passage included below.   Be sure to respond in a complete sentence, being comprehensive, including all relevant background information.   However, you are talking to a non-technical audience, so be sure to break down complicated concepts and   strike a friendly and converstional tone.   If the passage is irrelevant to the answer, you may ignore it.\n",
            "  QUESTION: 'How do you shift gears in the Google car?'\n",
            "  PASSAGE: 'Your Googlecar has an automatic transmission. To shift gears, simply move the shift lever to the desired position.  Park: This position is used when you are parked. The wheels are locked and the car cannot move. Reverse: This position is used to back up. Neutral: This position is used when you are stopped at a light or in traffic. The car is not in gear and will not move unless you press the gas pedal. Drive: This position is used to drive forward. Low: This position is used for driving in snow or other slippery conditions.'\n",
            "\n",
            "    ANSWER:\n",
            "\n"
          ]
        }
      ],
      "source": [
        "prompt = make_prompt(query, passage)\n",
        "print(prompt)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "qmdYdoIHcEc_"
      },
      "source": [
        "Choose one of the Gemini content generation models in order to find the answer to your query."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "B3fDj-jv5Sq_"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "models/gemini-1.0-pro\n",
            "models/gemini-1.0-pro-001\n",
            "models/gemini-1.0-pro-latest\n",
            "models/gemini-1.0-pro-vision-latest\n",
            "models/gemini-1.0-ultra-latest\n",
            "models/gemini-1.5-pro-eval\n",
            "models/gemini-pro\n",
            "models/gemini-pro-vision\n",
            "models/gemini-ultra\n"
          ]
        }
      ],
      "source": [
        "for m in genai.list_models():\n",
        "  if 'generateContent' in m.supported_generation_methods:\n",
        "    print(m.name)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "m30avD9cfQQ-"
      },
      "outputs": [],
      "source": [
        "model = genai.GenerativeModel('models/gemini-pro')\n",
        "answer = model.generate_content(prompt)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "COBhn6J9S_xI"
      },
      "outputs": [
        {
          "data": {
            "text/markdown": "The Google car, like most modern cars, is equipped with an automatic transmission. This means the car shifts gears based on various sensor inputs and the driving situation.  As the driver, you're not required to manually shift gears.",
            "text/plain": [
              "<IPython.core.display.Markdown object>"
            ]
          },
          "execution_count": 21,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "Markdown(answer.text)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "-ocgBrOEXZbT"
      },
      "source": [
        "## Next steps\n",
        "\n",
        "Check out the [embeddings quickstart](https://github.com/google-gemini/cookbook/blob/main/quickstarts/Embeddings.ipynb) to learn more, and browse the cookbook for more [examples](https://github.com/google-gemini/cookbook/tree/main/examples)."
      ]
    }
  ],
  "metadata": {
    "colab": {
      "name": "Talk_to_documents_with_embeddings.ipynb",
      "toc_visible": true
    },
    "google": {
      "image_path": "/site-assets/images/share.png",
      "keywords": [
        "examples",
        "googleai",
        "samplecode",
        "python",
        "embed"
      ]
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
